Abstract

In this paper we present algorithms for a number of problems in geometric pattern matching where
the input consists of a collection of segments in the plane. Our work consists of two main parts. In
the first, we address problems and measures that relate to collections of orthogonal line segments in the
plane. Such collections arise naturally from problems in mapping buildings and robot exploration.
We propose a new measure of segment similarity called a coverage measure, and present efficient
algorithms for maximising this measure between sets of axis-parallel segments under translations. Our
algorithms run in time $O(n^3\,\mathrm{polylog}\, n)$ in the general case, and in time $O(n^2\,\mathrm{polylog}\, n)$ for the case
when all segments are horizontal. In addition, we show that when restricted to translations that are
only vertical, the Hausdorff distance between two sets of horizontal segments can be computed in time
roughly $O(n^{3/2}\,\mathrm{polylog}\, n)$. These algorithms form significant improvements over the general algorithm
of Chew et al. that takes time $O(n^4 \log^2 n)$.

In the second part of this paper we address the problem of matching polygonal chains. We study the
well-known Fréchet distance, and present the first algorithm for computing the Fréchet distance under
general translations. Our methods also yield algorithms for computing a generalization of the Fréchet
distance, and we also present a simple approximation algorithm for the Fréchet distance that runs in
time $O(n^2\,\mathrm{polylog}\, n)$.



1 Introduction 



Traditionally, geometric pattern matching employs as a measure of similarity the Hausdorff distance $h(A,B)$,
defined as $h(A, B) = \max_{p \in A} \min_{q \in B} d(p, q)$ for two point sets $A$ and $B$. However, when the patterns to
be matched are line segments or curves (instead of points), this measure is less than satisfactory. It has been 
observed that measures like the Hausdorff measure that are defined on point sets are ill-suited as measures 
of curve similarity, because they destroy the continuity inherent in continuous curves. 
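
As a small illustration of the definition above, the following sketch evaluates the one-way Hausdorff distance between two finite point sets by brute force. The function name `directed_hausdorff` and the use of the Euclidean metric for $d$ are assumptions made for this example only.

```python
import math

def directed_hausdorff(A, B):
    """h(A, B): the largest distance from a point of A to its nearest point of B."""
    return max(min(math.dist(p, q) for q in B) for p in A)

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 1.0), (1.0, 1.0), (5.0, 1.0)]

# The measure is not symmetric: the far point (5, 1) of B only matters for h(B, A).
print(directed_hausdorff(A, B))
print(directed_hausdorff(B, A))
```

The asymmetry visible here is why the text speaks of a one-way distance; the sketch runs in time proportional to $|A| \cdot |B|$ and is meant only to fix the definition.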

This paper addresses problems in geometric pattern matching where the inputs are sets of line segments. 
Our work consists of two main parts; in the first part we consider the problem of matching (under translation) 
segments that are axis-parallel (i.e., either horizontal or vertical), and in the second we consider the problem
of matching polygonal chains under translation. We study two different measures in this context; the first is
a novel measure called the coverage measure, which captures the similarity between orthogonal segments
that may partially overlap with one another. The other is the well-known Fréchet distance, first proposed by
Maurice Fréchet in 1906 as a measure of distance between distributions, which has often been referred to as
a natural measure of curve similarity [|3, 16, ^]. We discuss each measure in detail below.



1.1 Mapping and orthogonality 

The motivation for considering instances of pattern matching where the input line segments are orthogonal 
comes from the domain of mapping, in which a robot is required to map the underlying structure of a 
building by moving inside the building, and "sensing" or "studying" its environment. 

In one such mapping project at the Stanford Robotics Laboratory^1, the robot is equipped with a laser
range finder which supplies the distance from the robot to its nearest neighbor in a dense set of directions
in a horizontal plane. We call the resulting distance map a picture. Figure 1(a) shows the robot used at
Stanford for this purpose and the laser range finder installed on the robot.

During the mapping process, the robot must merge into a single map the series of pictures that it captures 
from different locations in the building. 




Figure 1: (a) The robot, and the laser range finder installed on it. (b) Typical "picture" obtained by
the robot of a corridor (after segmentation). (c) The corridor itself.



Since the dead reckoning of the robot is not very accurate, it cannot rely solely on its motion to decide how
the pictures are placed together. Thus, we need a matching process that can align (by using overlapping
regions) the different pictures taken from different points of the same environment. In addition, we need to
determine whether the robot has returned to a point already visited. We make the reasonable assumption
that the walls of a building are almost always either orthogonal or parallel to each other, and that these walls are
frequently by far the most dominant objects in the picture. This is especially significant when the
robot is inside a corridor, where there is a lack of detail needed for good registration. In some cases most
of the picture consists merely of two walls with a small number of other segments. See Figure 1(b),(c) for a
typical picture and the real region that the laser range finder senses.

^1 The interested reader can find more information at the URL underdog.stanford.edu

This application suggests the study of matching sets of horizontal and vertical segments. Observe that
we may restrict ourselves to alignments under translation, as it is easy to find the correct rotation for matching
sets of orthogonal segments. Formally, let $A = \{a_1, \ldots, a_n\}$ and $B = \{b_1, \ldots, b_n\}$ be two sets of orthogonal
line segments in the plane, and let $\epsilon$ be a given parameter. A point $p$ of a horizontal (resp. vertical) segment
$a \in A$ is covered if there is a point of a horizontal (resp. vertical) segment $b \in B$ whose distance from
$p$ is at most $\epsilon$, where the distance is measured using the $\ell_\infty$ norm. Let $w(A,B)$ denote the collection of
sub-segments of $A$ consisting of covered points. Let $Cov(A, B)$ be the total length of the segments of $w(A, B)$.
The maximum coverage problem is to find a translation $t^*$ in the translation plane ($TP$) that maximizes
$Cov(t) = Cov(t + A, B)$. To the best of our knowledge, this measure is novel.

The coverage measure is especially relevant in the case of long segments e.g. inside a corridor, when we 
might be interested in partially matching portions of long segments to portions of other segments. 
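
To make the coverage measure concrete, here is a minimal sketch that evaluates $Cov(t + A, B)$ for horizontal segments only, directly from the definition. The representation of a segment as a triple `(x1, x2, y)` and the names `coverage` and `translate` are assumptions of this example; it is a reference evaluation for one fixed translation, not the maximisation algorithm of Section 2.

```python
def coverage(A, B, eps):
    """Total length of the parts of A within l_inf distance eps of some segment of B.

    Segments are horizontal and given as (x1, x2, y) with x1 <= x2.
    """
    total = 0.0
    for ax1, ax2, ay in A:
        pieces = []
        for bx1, bx2, by in B:
            if abs(ay - by) <= eps:
                # Part of a inside the eps-expansion of b along the x-axis.
                lo, hi = max(ax1, bx1 - eps), min(ax2, bx2 + eps)
                if lo < hi:
                    pieces.append((lo, hi))
        # Measure the union of the covered pieces so overlaps are counted once.
        pieces.sort()
        reach = None
        for lo, hi in pieces:
            if reach is None or lo > reach:
                total += hi - lo
                reach = hi
            elif hi > reach:
                total += hi - reach
                reach = hi
    return total

def translate(A, t):
    tx, ty = t
    return [(x1 + tx, x2 + tx, y + ty) for x1, x2, y in A]

A = [(0.0, 4.0, 0.0)]
B = [(1.0, 2.0, 0.5), (3.0, 6.0, 0.2)]
print(coverage(A, B, 0.5))
print(coverage(translate(A, (0.0, 1.0)), B, 0.5))
```

Evaluating this function over a set of candidate translations gives a slow baseline against which a faster implementation can be checked.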

Our Results In Section 2 we present an algorithm that solves the Coverage problem between sets of
axis-parallel segments in time $O(n^3\,\mathrm{polylog}\, n)$ and the Coverage problem between horizontal segments in time
$O(n^2 \log n)$. Note that the known algorithms for matching arbitrary sets of line segments are much slower.
For example, the best known algorithm for finding a translation that minimizes the Hausdorff distance
between two sets of $n$ segments in the plane runs in time $O(n^4 \log^2 n)$ [^ ^]. We also show that the
combinatorial complexity of the Hausdorff matching between segments is $\Omega(n^4)$, even if all segments are
horizontal. This strengthens the bounds shown by Rucklidge [14], and demonstrates that our algorithms,
much like the algorithms of [^ |8]], are able to avoid having to examine each cell of the underlying arrangement
individually. Note that all our results extend to the case when segments are weighted and the coverage is a
weighted sum of interval lengths.

In Section 3 we consider the related problem of matching horizontal segments under vertical
translations (under the Hausdorff measure). It has been observed that if horizontal translations are allowed,
then this problem is 3SUM-hard [||], indicating that finding a sub-quadratic algorithm may be hard. However,
we present an algorithm running in time $O(n^{3/2} \max(\log^c M, \log^c n, 1/\epsilon^c))$, for some fixed constant
$c$, which is sub-quadratic in most cases. Here, $M$ denotes the ratio of the diameter to the closest pair of
points in the sets of segments (where pairs of points must lie on different segments).

1.2 The Fréchet distance

In the second part of the paper, we consider measures for matching polygonal chains under the Fréchet
distance. Let us define a curve as a continuous mapping $P : [a, a'] \to \mathbb{R}^2$. The Fréchet distance between
two curves $P : [a, a'] \to \mathbb{R}^2$ and $Q : [b, b'] \to \mathbb{R}^2$, denoted $d_F(P, Q)$, is defined as:

$$d_F(P,Q) = \inf_{\alpha,\beta} \max_{t \in [0,1]} \| P(\alpha(t)) - Q(\beta(t)) \|$$

where $\alpha, \beta$ range over continuous increasing functions $[0,1] \to [a, a']$ and $[0,1] \to [b, b']$ respectively.

Alt and Godau proposed the first algorithm for computing the Fréchet distance between two polygonal
chains (with no transformations). Their method is elegant and simple, and runs in time $O(pq)$, where $p$ and $q$
are the number of segments in the two polygonal chains. In his Ph.D. thesis [20], Michael Godau presents an
extensive study of the complexity of computing the Fréchet distance. He shows that computing the Fréchet
distance between two simplicial objects is NP-hard, for any dimension d > 3.

Although the Fréchet distance is a natural measure for curve similarity, its applicability has been limited
by the fact that no algorithms exist to minimise the Fréchet distance between curves under various transformation
groups. Prior to our work, the only result on computing the Fréchet distance under transformations
was presented by Venkatasubramanian [^]. He decides whether $\min_{t \in TP_x} d_F(P, Q + t) \le \epsilon$, where $TP_x$ is the
set of translations along a fixed direction, in time O (n^ poly log n) (where n = p + q). In fact, our methods
can be viewed as a generalization of his methods and can be used to solve his problem in the same time
bound. 

Our Results In Section 4 we present the first algorithm for computing the Fréchet distance between two
polygonal chains minimized under translations^2. The algorithm is based on a reduction to a dynamic graph
reachability problem; its running time is 0{n polylogn). 

If we drop the restriction that the functions $\alpha, \beta$ must be increasing, we obtain a measure that we call the
weak Fréchet distance, denoted by $\tilde{d}_F$. Our methods can be used to decide whether $\min_{t \in TP} \tilde{d}_F(P, Q+t) \le \epsilon$;
in this case, the underlying graph is undirected, yielding an algorithm that runs in time 0{n^ polylogn).

With the exact algorithms being rather expensive, it is natural to ask whether approximations can be 
obtained efficiently. A simple observation shows that we can obtain an $(\epsilon, \beta)$-approximation to the Fréchet
distance under translations in time $O(n^2\,\mathrm{poly}(\log n, 1/\beta))$.

2 Maximum Coverage Among Sets Of Segments 

Let $A = \{a_1, \ldots, a_n\}$ and $B = \{b_1, \ldots, b_n\}$ be two sets of axis-parallel line segments in the plane, and let $\epsilon$
be a given parameter. Recall the coverage measure $Cov(A, B)$ as defined in the introduction.

2.1 Computing coverage with axis-parallel segments 

We first consider the case that the sets $A$ and $B$ consist of both horizontal and vertical segments. Let $A^h$
(resp. $B^h$) be a set of $n$ horizontal segments and let $A^v$ (resp. $B^v$) be a set of $n$ vertical segments. Let $\epsilon$ be
a given parameter. Let $A = A^h \cup A^v$ and let $B = B^h \cup B^v$. Let $Cov(t + A, B) = Cov(t + A^h, B^h) +
Cov(t + A^v, B^v)$.

We first need the following lemma, whose proof is deferred to the appendix. Let $S = \{s_1, \ldots, s_m\}$ be a
set of non-vertical segments in $\mathbb{R}^2$. For each segment $s_i \in S$ we define the function $s_i(x) : \mathbb{R} \to \mathbb{R}$ as follows:
for every $x \in \mathbb{R}$, $s_i(x)$ is the y-coordinate of the intersection point of $s_i$ and the vertical line passing through
$x$, if such an intersection point exists. We set $s_i(x)$ to be 0 otherwise. Let $sum_S(x) = \sum_{i=1}^{m} s_i(x)$, and let
$\max(sum_S(\cdot)) = \max_{x \in \mathbb{R}} sum_S(x)$. Furthermore, let $T = T(\tau)$ be a subset of $S$ consisting of horizontal
segments that can move vertically at constant speed, i.e., the y-coordinates of the endpoints of each $s_j \in T$
are given by $y = a_j \tau + b_j$.

Lemma 2.1 Given a set of non-vertical segments $S$ with a subset $T$ of horizontal moving segments, we can
maintain $\max(sum_S(\cdot))$ under segment insertions or deletions in amortized time $O(\sqrt{|S|})$ per operation.
In addition, we can maintain $\max(sum_S(\cdot))$ under a time-decreasing step ($\tau \leftarrow \tau - \Delta$) in $O(1)$ time.
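
The quantity maintained by the lemma can be pinned down with a brute-force evaluation. The sketch below is not the dynamic structure of the lemma: it recomputes the maximum from scratch in quadratic time, and it assumes that all segment heights are non-negative, so that the maximum is attained at the x-coordinate of some segment endpoint. The names `seg_value` and `max_sum` are hypothetical.

```python
def seg_value(seg, x):
    """s_i(x): height of the segment above x, or 0 outside its x-span."""
    (x1, y1), (x2, y2) = seg
    if x < x1 or x > x2:
        return 0.0
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)

def max_sum(segments):
    """max over x of the sum of s_i(x), assuming non-negative heights."""
    # With non-negative heights the sum is largest at an endpoint abscissa,
    # because every segment active just beside an endpoint is also active at it.
    xs = {x for (x1, _), (x2, _) in segments for x in (x1, x2)}
    return max(sum(seg_value(s, x) for s in segments) for x in xs)

S = [((0, 1), (4, 1)), ((1, 0), (3, 2)), ((2, 3), (5, 0))]
print(max_sum(S))
```

Such a reference routine is useful mainly for testing an implementation of the amortized structure on small inputs.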



Theorem 2.2 We can find a translation $t$ that maximizes $Cov(t + A, B)$ in time $O(n^3\,\mathrm{polylog}\, n)$, where
$n = |A| + |B|$.

Proof: The proposed algorithm is a line-sweep algorithm, with the sweep line $\ell$ moving from top to bottom.
For a segment $b_i \in B$ let $b_i^\epsilon$ denote the rectangle consisting of all points whose $\ell_\infty$ distance from $b_i$ is
at most $\epsilon$. Let $B^\epsilon$ denote the union $\bigcup_{i=1}^{n} b_i^\epsilon$. Note that the boundaries of any two rectangles $b_i^\epsilon, b_j^\epsilon$ intersect in at most two
points, so by [ p^ ] the complexity of the boundary of $B^\epsilon$ is $O(n)$. Consider $E = \{p_1, \ldots, p_{2n}\}$, the set of the
$2n$ endpoints of the segments of $A$. Define the layer $L_i = B^\epsilon - p_i$, which is the region in the $TP$ of all
translations $t$ that shift $p_i$ into $B^\epsilon$, i.e., $t + p_i \in B^\epsilon$. Let $\mathcal{B}^h$ (resp. $\mathcal{B}^v$) be the collection of layers created by
the horizontal (resp. vertical) segments of $A$. As the line sweep traverses the translation plane from top to
bottom, we encounter events where $\ell$ intersects a horizontal boundary segment of either $\mathcal{B}^h$ or $\mathcal{B}^v$.

Horizontal boundaries of $\mathcal{B}^h$: Let $Cov(x) : \mathbb{R} \to \mathbb{R}$ be the value of $Cov(t + A^h, B^h)$, where $t$ is the point
on $\ell$ vertically above $x$. Consider the contribution to $Cov(t + A^h, B^h)$ from the interaction between the
segments $a \in A^h$, $b \in B^h$. This contribution is a piecewise linear function consisting
of five segments: it is zero for values of $x$ which are very far from the regions of interaction between $a$ and $b$,
it is a constant that equals the minimum of the lengths of $a$ and $b$ when $x$ is near the region of intersection, and
it consists of two segments of slopes 1 and $-1$ connecting these segments. These segments exist for all
instances of the line sweep where its distance to the boundary of the rectangle of $B^\epsilon$ that corresponds
to $a$ is at most $\epsilon$. There are $O(n^2)$ update operations, and each update can be processed in $O(n\,\mathrm{polylog}\, n)$ time by
Lemma 2.1.

^2 Actually, we solve the decision version of the problem: for a given $\epsilon$, determine whether $\min_{t \in TP} d_F(P, Q + t) \le \epsilon$.

Horizontal boundaries of $\mathcal{B}^v$: For two vertical segments $a_i \in A$, $b_j \in B$, let $T_{ij}$ be the set of translations
for which the horizontal distance from $a_i$ to $b_j$ is at most $\epsilon$. Assume w.l.o.g. that $|a_i| > |b_j|$. Let $UP_{ij}$
denote all translations $t$ for which the upper endpoint of $a_i$ is covered by $t + b_j$ (i.e., its distance from some
point of $t + b_j$ is at most $\epsilon$) but the lower endpoint of $a_i$ is not covered. Similarly, let $DOWN_{ij}$ denote all
translations $t$ for which the lower endpoint of $a_i$ is covered by $t + b_j$, but the upper endpoint of $a_i$ is not
covered, and let $MID_{ij}$ denote all translations $t$ for which both endpoints of $a_i$ are covered.

Thus $Cov(t + a_i, b_j)$ is zero when $t \notin UP_{ij} \cup MID_{ij} \cup DOWN_{ij}$, $Cov(t + a_i, b_j)$ is a constant when
$t \in MID_{ij}$, and it is a decreasing (resp. increasing) linear function that depends only on the y-coordinate
of $t$ when $t \in UP_{ij}$ (resp. $t \in DOWN_{ij}$). Therefore, we can represent the contribution of $a_i$ and $b_j$ to
$Cov(a_i, t + b_j)$ by a horizontal segment $u_{ij}(\tau)$ of length $2\epsilon$ that starts at $y = 0$ and moves upwards with
constant velocity as the line sweep intersects $DOWN_{ij}$. It remains constant at a maximum height as $\ell$
passes through $MID_{ij}$ and moves downwards to 0 as $\ell$ passes through $UP_{ij}$.



This suggests the following operations on the data structures, using Lemma 2.1. Consider the rectangle
$b_j$ of the vertical decomposition of $L_i$ (which corresponds to translations for which $a_i$ is in the vicinity of
$b_j$). We divide $b_j$ into three rectangles $b_{ij,UP}$, $b_{ij,MID}$ and $b_{ij,DOWN}$, which are the intersection regions
of $b_j$ with $UP_{ij}$, $MID_{ij}$ and $DOWN_{ij}$. As the line sweep hits the upper boundary of a rectangle $b_{ij,UP}$,
we insert the moving segment $u_{ij}(\tau)$ into $T(\tau)$. When $\ell$ reaches the upper boundary of $b_{ij,MID}$ we insert
a horizontal moving segment $u'_{ij}(\tau)$ chosen such that $u_{ij}(\tau) + u'_{ij}(\tau)$ equals $Max_{ij}$. This is done in
order to avoid deleting or changing $u_{ij}(\tau)$. When $\ell$ reaches the upper boundary of $b_{ij,DOWN}$, we insert
into $T(\tau)$ the segment $u''_{ij}(\tau)$, which also decreases linearly as $\tau$ decreases, and is chosen such that
$u_{ij}(\tau) + u'_{ij}(\tau) + u''_{ij}(\tau)$ equals $Cov(a_i, t + b_j)$ at this translation $t$, $t \in DOWN_{ij}$. Overall, we add three
(moving) segments for each rectangle of $L_i$, and since the number of these rectangles is $O(n^2)$, it follows
that the overall running time of the algorithm is $O(n^3\,\mathrm{polylog}\, n)$. Note also that at each update, we decrease
the current "time" $\tau$; this is a constant time operation per update. ■
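
The five-piece contribution function used in the proof can be written down explicitly for a single pair of horizontal segments. The sketch below assumes the two segments are already within vertical distance $\epsilon$ and treats them as intervals on the x-axis; the name `pair_coverage` is hypothetical. Under this reading of the definition, $b$ is expanded by $\epsilon$ on both sides, so the plateau height is $\min(|a|, |b| + 2\epsilon)$.

```python
def pair_coverage(a, b, eps, x):
    """Covered length of interval a shifted by x against interval b expanded by eps."""
    a1, a2 = a
    b1, b2 = b
    lo = max(a1 + x, b1 - eps)
    hi = min(a2 + x, b2 + eps)
    return max(0.0, hi - lo)

a, b, eps = (0.0, 2.0), (10.0, 15.0), 1.0
# Zero, rising with slope 1, constant, falling with slope -1, zero again.
for x in (0.0, 8.0, 9.0, 12.0, 15.0, 20.0):
    print(x, pair_coverage(a, b, eps, x))
```

Summing such trapezoidal functions over all interacting pairs gives the function whose maximum the sweep maintains.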

2.2 Maximum coverage for horizontal segments 

This is a line-sweep algorithm reminiscent of the Chew-Kedem ^ and Chew et al. [^] algorithms for
computing the similarity between point sets in the plane, under the $\ell_\infty$ norm. As in Section 2.1, we define
layers $L_i$ for each endpoint $p_i$ of segments in $A$. Construct a horizontal decomposition of $L_i$, breaking it
into a collection $B_i = \{\beta_{i1}, \beta_{i2}, \ldots\}$ of $O(n)$ interior-disjoint rectangles.

Let $S$ denote the set of vertical segments on the boundaries of the layers $L_i$ (for $i = 1, \ldots, 2n$). Let $T$ be
a segment tree constructed on the segments of $S$. During the algorithm we sweep the translation plane $TP$
using a vertical sweep line $\ell$. Once $\ell$ meets a segment $e \in S$, we insert $e$ into $T$. No segment is deleted.

Let $\mu$ be a node of $T$. Let $I_\mu$ be the horizontal infinite strip whose y-span is the interval of $\mu$ and let
$S_\mu \subseteq S$ denote the segments on or to the left of $\ell$ which correspond to $\mu$, i.e., the segments whose y-span
contains $I_\mu$ but not $I_{father(\mu)}$. We maintain the following fields with each node $\mu$ of $T$. All of these are set
to zero at the beginning of the algorithm:

• $last_\mu$: the last x event at which a segment was inserted into $S_\mu$.

• $Pos_\mu$: the number of segments in $S_\mu$ resulting from the right (resp. left) endpoint of a segment $a \in A$
meeting a left (resp. right) vertical segment of some layer. We call such an event a Positive event.

• $Neg_\mu$: the number of segments in $S_\mu$ resulting from the left (resp. right) endpoint of a segment $a \in A$
meeting a left (resp. right) vertical segment of some layer. We call such an event a Negative event.

• $w_\mu$: the maximal coverage obtained by segments stored at $S_\mu$ itself.

• $Cov_\mu$: the maximal coverage obtained by events of "segments" stored at the descendant nodes of $\mu$,
including $\mu$ itself.

Performing an insertion: Once $\ell$ hits a new segment $s \in S$, we first find all nodes $\mu$ for which $s \in S_\mu$ as
in a standard segment tree. Next, for each such node $\mu$, we increase either $Pos_\mu$ or $Neg_\mu$ by one, according
to the type of $s$. Next we add to $w_\mu$ the quantity $(Pos_\mu - Neg_\mu) d$, where $d$ is the horizontal distance from
the previous insertion event into $S_\mu$ (stored at $last_\mu$) to the current position of $\ell$. We update $Cov_\mu$ for
each $\mu$ in bottom-up fashion, namely: $Cov_\mu = \max\{Cov_{left(\mu)}, Cov_{right(\mu)}\} + w_\mu$. Each insertion can be
performed in $O(\log n)$ time, so the overall running time of the algorithm is $O(n^2 \log n)$. When the algorithm
terminates, we report a translation $t_{output}$ that corresponds to the maximum value of $Cov_{root(T)}$ obtained by
the algorithm.
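
The bottom-up rule $Cov_\mu = \max\{Cov_{left(\mu)}, Cov_{right(\mu)}\} + w_\mu$ is the standard way to keep the largest root-to-leaf sum available at the root of a segment tree. The following simplified sketch illustrates only that rule: weights are added directly to the canonical nodes of a leaf range, without the $Pos$, $Neg$ and $last$ bookkeeping of the sweep. The class name `MaxPathTree` and its interface are assumptions of this example.

```python
class MaxPathTree:
    """Segment tree storing a weight w per node; cov[node] is the best
    sum of weights on a path from that node down to a leaf."""

    def __init__(self, num_leaves):
        size = 1
        while size < num_leaves:
            size *= 2
        self.size = size
        self.w = [0.0] * (2 * size)
        self.cov = [0.0] * (2 * size)

    def add(self, lo, hi, delta, node=1, node_lo=0, node_hi=None):
        """Add delta to every leaf in [lo, hi) via the canonical nodes."""
        if node_hi is None:
            node_hi = self.size
        if hi <= node_lo or node_hi <= lo:
            return
        if lo <= node_lo and node_hi <= hi:
            self.w[node] += delta
        else:
            mid = (node_lo + node_hi) // 2
            self.add(lo, hi, delta, 2 * node, node_lo, mid)
            self.add(lo, hi, delta, 2 * node + 1, mid, node_hi)
        # Bottom-up refresh: best child plus the weight stored here.
        best_child = 0.0
        if node < self.size:
            best_child = max(self.cov[2 * node], self.cov[2 * node + 1])
        self.cov[node] = self.w[node] + best_child

    def best(self):
        return self.cov[1]

tree = MaxPathTree(4)
tree.add(0, 3, 2.0)
tree.add(1, 4, 1.5)
tree.add(2, 3, -0.5)
print(tree.best())
```

Each update touches $O(\log n)$ nodes, which matches the per-insertion cost quoted above. Padding leaves introduced by rounding the size up to a power of two count as leaves of value zero in this sketch.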

Remark: The algorithm can easily be modified to handle the weighted case, where each segment has a 
weight, and the contribution to the coverage of a segment is the length of the covered portions times the 
weight of the segment. This is useful when some segments are more important than others. 

Theorem 2.3 Let $t^* \in TP$ be the leftmost translation that maximises $Cov(t + A, B)$. Then when the
line sweep passes through $t^*$, $Cov(t^* + A, B) = Cov_{root(T)}$.

Proof: We first make the following observation. Consider the infinite horizontal ray $r$ emerging from $t^*$
to the left. Let $x_1, \ldots, x_l$ be the x-coordinates of the events encountered along this ray, ordered from left to
right. Let $Pos_i$ (resp. $Neg_i$) be defined as the number of intersection points of $r$ to the left of $x_i$
with boundaries of layers that correspond to positive (resp. negative) events, as described above. Clearly

$$Cov(t^* + A, B) = \sum_{i=1}^{l} (Pos_i - Neg_i)(x_i - x_{i-1}) \qquad (1)$$

On the other hand, the sum on the right hand side of (1) equals the sum of the fields $w_\mu$, taken over all
nodes $\mu$ of the segment tree on the path from the root to the leaf node containing $t^*$, at the instant when the
line sweep intersects $t^*$. This follows from the fact that each event $x_i$ is also an event in one of the nodes $\mu$
along this path. Therefore this sum equals $Cov_{root(T)}$, since the sum of the fields $w_\mu$ along every path from
the root to a leaf equals $Cov(t + A, B)$ at any translation $t$ stored at that leaf, and $t^*$ by our assumption is
maximal. ■

2.3 A lower bound 

Rucklidge [14] showed that given a parameter $\epsilon$ and two families $A$ and $B$ of segments in the plane, the
combinatorial complexity of the region in the translation plane ($TP$) of all translations $t$ for which
$h(t + A, B) \le \epsilon$ is in the worst case $\Omega(n^4)$, where $h(A, B)$ is the one-way Hausdorff distance from $A$ to $B$. We
show that the $\Omega(n^4)$ bound holds even in the case that all segments are horizontal (the proof is deferred to
the appendix). This implies:

Theorem 2.4 The region of all translations $t$ for which $Cov(A, t + B)$ is maximal has combinatorial
complexity $\Omega(n^4)$.

3 Matching Horizontal Segments Under Vertical Translation 

In this section we describe a sub-quadratic algorithm for the Hausdorff matching between sets $A$ and $B$ of
horizontal segments, when translations are restricted to the vertical direction.

Let $\rho^* = \min_t h(t + A, B)$ where $t$ varies over all vertical translations, and $h(\cdot, \cdot)$ is the one-way Hausdorff
distance. Let $M$ denote the ratio of the diameter to the closest pair of segments in $A \cup B$. Further, let
$[M]$ denote the set of integers $\{1, \ldots, M\}$.

Theorem 3.1 Let $A$ and $B$ be two sets of horizontal segments, and let $\epsilon < 1$ be a given parameter. Then we
can find a vertical translation $t$ for which $h(t + A, B) \le (1 + \epsilon)\rho^*$ in time $O(n^{3/2}\,\mathrm{poly}(\log M, \log n, 1/\epsilon))$.

We first relate our problem to a problem in string matching: 

Definition 3.2 (Interval matching): Given two sequences $t = t[1] \ldots t[n]$ and $p = p[1] \ldots p[m]$, such that
$p[i] \in [M]$ and $t[i]$ is a union of disjoint intervals $\{a_i^1 \ldots b_i^1\} \cup \{a_i^2 \ldots b_i^2\} \cup \ldots$ with endpoints in $[M]$, find
all translations $j$ such that $p[i] \in t[i + j]$ for all $i$. The size of the input to this problem is defined as
$s = \sum_i |t[i]| + m$.
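
The definition can be checked directly with a quadratic-time reference matcher. The sketch below reads the matching condition as $p[i] \in t[i + j]$ for every pattern position $i$, uses 0-based indices, and represents each $t[i]$ as a list of closed intervals `(lo, hi)`; these conventions and the names `in_union` and `interval_matches` are assumptions of the example. It is not the sub-quadratic algorithm described later in this section.

```python
def in_union(x, intervals):
    """True if x lies in one of the closed intervals (lo, hi)."""
    return any(lo <= x <= hi for lo, hi in intervals)

def interval_matches(t, p):
    """All shifts j such that p[i] is in t[i + j] for every i."""
    n, m = len(t), len(p)
    return [j for j in range(n - m + 1)
            if all(in_union(p[i], t[i + j]) for i in range(m))]

t = [[(1, 3)], [(2, 2), (5, 7)], [(1, 1)], [(4, 6)], [(1, 9)]]
print(interval_matches(t, [1, 5]))
print(interval_matches(t, [2, 5]))
```

A slow matcher of this kind is a convenient oracle when testing a faster implementation.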

We also define the sparse interval matching problem, in which both $p[i]$ and $t[i]$ are allowed to be equal
to a special empty set symbol $\emptyset$, which matches any other symbol or set. The size $s$ in this case is defined as
$\sum_i |t[i]|$ plus the number of non-empty pattern symbols. Using standard discretization techniques [^, [TTI ],
we can show that the problem of $(1 + \epsilon)$-approximating the minimum Hausdorff distance between two sets
of $n$ horizontal intervals with coordinates from $[M]$ under vertical motion can be reduced to solving an
instance of sparse interval matching with size $s = O(n)$.

Having thus reduced the problem of matching segments to an instance of sparse interval matching, we 
show that: 

• The (non-sparse) interval matching problem can be solved in time $O(s^{3/2}\,\mathrm{polylog}\, s)$.

• The same holds even if the pattern is allowed to consist of unions of intervals.

• The sparse interval matching problem of size $s$ can be reduced to $O(\log M)$ non-sparse interval matching
problems, each of size $s' = O(s\,\mathrm{polylog}\, s)$.



These three observations yield the proof of Theorem 3.1. In the remainder of this section, we sketch
proofs of the above observations.

The interval matching problem. Our method follows the approach of [jl], |13|] and [Q] ; therefore, we sketch 
the algorithm here, omitting detailed proofs of correctness. 

Firstly, we observe that the universe size $M$ can be reduced to $O(s)$, by sorting the coordinates of
the points/interval endpoints and replacing them by their rank, which clearly does not change the solution.
Then we reduce the universe further to $M' = O(\sqrt{s})$ by merging some coordinates, i.e., replacing several
coordinates $x_1, \ldots, x_k$ by one symbol $\{x_1, \ldots, x_k\}$, in the following way. Each coordinate (say $x$) which
occurs more than $\sqrt{s}$ times in $t$ or $p$ is replaced by a singleton set $\{x\}$ (clearly, there are at most $O(\sqrt{s})$
such coordinates). By removing those coordinates, the interval $[M]$ is split into at most $O(\sqrt{s})$ intervals.
We partition each interval into smaller intervals, such that the sum of all occurrences of all coordinates in
each interval is $O(\sqrt{s})$. Clearly, the total number of intervals obtained in this way is $O(\sqrt{s})$. Finally, we replace
all coordinates in an interval by one (new) symbol from $[M']$ where $M' = O(\sqrt{s})$. By replacing each
coordinate $x$ in $p$ and $t$ by the number of the set to which $x$ belongs, we obtain a "coarse representation" of
the input, which we denote by $p'$ and $t'$.
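
The first step of this reduction, replacing coordinates by their ranks, can be sketched as follows. The function name `rank_compress` and the list-of-intervals representation are assumptions of this example; the subsequent merging of coordinates into $O(\sqrt{s})$ coarse symbols is not shown.

```python
def rank_compress(t, p):
    """Replace every coordinate in t and p by its rank among all coordinates used.

    t is a list of lists of closed intervals (lo, hi); p is a list of integers.
    The map is order preserving, so interval membership is unchanged.
    """
    coords = sorted({x for ivs in t for iv in ivs for x in iv} | set(p))
    rank = {x: r for r, x in enumerate(coords, start=1)}
    t2 = [[(rank[lo], rank[hi]) for lo, hi in ivs] for ivs in t]
    p2 = [rank[x] for x in p]
    return t2, p2

t = [[(10, 300)], [(20, 20), (500, 700)]]
p = [20, 500]
print(rank_compress(t, p))
```

Because only relative order matters for membership in an interval, the set of matching shifts is the same before and after compression.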

In the next phase, we solve the interval matching problem for $p'$ and $t'$ in time $O(nM')$ using a Fast
Fourier Transform-based algorithm (see the above references for details). Thus we exclude all translations
$j$ for which there is $i$ such that $p[i]$ is not included in the approximation of $t[i + j]$. However, it could still be
true that $p[i] \notin t[i + j]$ while $p'[i] \in t'[i + j]$. Fortunately, the total number of such pairs $(i, j)$ is bounded by
the number of new symbols (i.e. $M'$) times the number of pairs of all occurrences of any two (old) symbols
corresponding to a given new symbol (i.e. $O((\sqrt{s})^2)$). This gives a total of $O(s^{3/2})$ pairs to check. Each
check can be done in $O(\log n)$ time, since we can build a data structure over each set of intervals $t[i]$ which
enables fast membership queries. Therefore, the total time needed for this phase of the algorithm is $O(s^{3/2} \log n)$,
which is also a bound for the total running time.

The generalization to the case where $p[i]$ is a union of intervals follows in essentially the same way, so
we skip the description here.

The sparse-to-non-sparse reduction. The idea here is to map the input sequences to sequences of length
$P$, where $P$ is a random prime number from the range $\{c_1 s \log M, \ldots, c_2 s \log M\}$ for some constants $c_1, c_2$.
The new sequences $p'$ and $t'$ are defined as $p'[i] = \bigcup_{i' : i' \bmod P = i} p[i']$ and $t'[i] = \bigcup_{i' : i' \bmod P = i} t[i']$. It can be
shown (using similar ideas as in ||^) that if a translation $j$ does not result in a match between $p$ and $t$, it will
remain a mismatch between $p'$ and $t'$ with constant probability. Therefore, all possible mismatches will be
detected with high probability by performing $O(\log M)$ mappings modulo a random prime.
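
The folding step of this reduction is easy to state in code. The sketch below assumes a sparse sequence is stored as a dictionary from positions to the symbols (or intervals) present there, with absent positions standing for the empty symbol; the name `fold` is hypothetical, and the random choice of the prime $P$ and the repetition over $O(\log M)$ primes are omitted.

```python
def fold(seq, P):
    """Fold a sparse sequence modulo P: slot i collects everything stored
    at positions i' with i' mod P == i."""
    folded = [set() for _ in range(P)]
    for pos, items in seq.items():
        folded[pos % P] |= set(items)
    return folded

pattern = {0: [3], 7: [5], 12: [4]}
print(fold(pattern, 5))
```

Positions 7 and 12 collide in slot 2 here; such collisions are the reason a single prime detects a given mismatch only with constant probability.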

4 Computing The Fréchet Distance Under Translation

In this section, we present algorithms for computing the Fréchet distance between two polygonal chains.
Recall that the Fréchet distance between two curves $P$ and $Q$, $d_F(P, Q)$, is defined as:

$$d_F(P,Q) = \inf_{\alpha,\beta} \max_{t \in [0,1]} \| P(\alpha(t)) - Q(\beta(t)) \|$$

where $\alpha, \beta$ range over continuous increasing functions $[0, 1] \to [a, a']$ and $[0, 1] \to [b, b']$ respectively.

Dropping the restriction that $\alpha, \beta$ are increasing functions yields a measure we call the weak Fréchet
distance, denoted by $\tilde{d}_F$. It can be easily seen that both $d_F$ and $\tilde{d}_F$ are metrics.

Let the curves $P$ and $Q$ be length-parameterized by $r, s$. In other words, $P = P(r)$, $Q = Q(s)$, where
$0 \le r, s \le 1$. For any fixed $\epsilon$, let $F_\epsilon(P, Q)$, the free space, be defined as

$$F_\epsilon(P,Q) = \{(r,s) \mid \|P(r) - Q(s)\| \le \epsilon\}$$

where $\| \cdot \|$ is the underlying norm^3. The free space captures the space of parameterizations that achieve a
Fréchet distance of at most $\epsilon$. In the sequel we will denote the free space by $F_\epsilon$ when the parameters $P$ and
$Q$ are clear from the context.

Let a polygonal chain $P : [0, n] \to \mathbb{R}^2$ be a curve such that for each $i \in \{0, \ldots, n-1\}$, $P|_{[i,i+1]}$ is affine,
i.e., $P(i + \lambda) = (1 - \lambda)P(i) + \lambda P(i + 1)$, $0 \le \lambda \le 1$. For such a chain $P$, denote $|P| = n$. Let $P_i$ denote
the segment $P|_{[i,i+1]}$. For two polygonal chains $P, Q$ where $|P| = p$, $|Q| = q$, and a fixed $\epsilon$, the free space
$F_\epsilon \subseteq [0,p] \times [0, q]$ is given (as before) by:

$$F_\epsilon(P,Q) = \{(r,s) \mid \|P(r) - Q(s)\| \le \epsilon\}$$

Let $F_\epsilon^{ij} = F_\epsilon \cap (P_i \times Q_j)$. Observe that $F_\epsilon^{ij} = F_\epsilon(P_i, Q_j)$. It can be seen ^ that $F_\epsilon^{ij}$ is the affine
inverse of a unit ball with respect to the underlying norm. Consequently, $F_\epsilon^{ij}$ is convex.

Consider the points of intersection of a single cell $C_{ij} = F_\epsilon^{ij}$ with the line segment from $(i,j)$ to
$(i,j + 1)$. Since $C_{ij}$ is convex, there are at most two such points, which we denote as $a_{ij}, b_{ij}$, where $a_{ij}$ is
below $b_{ij}$. Similarly, let $c_{ij}$ and $d_{ij}$ be the points of intersection of $C_{ij}$ with the line segment from $(i, j)$ to
$(i + 1, j)$, where $c_{ij}$ is to the left of $d_{ij}$.
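
For the Euclidean norm, the boundary points of a cell along one of its edges can be obtained by intersecting a disk with a segment. The sketch below computes, for a fixed vertex of one chain and a segment of the other, the range of the segment parameter $\lambda \in [0,1]$ that lies in the free space; its two ends play the role of the pair $a_{ij}, b_{ij}$ (or $c_{ij}, d_{ij}$) measured along that edge. The function name `free_interval` and the argument layout are assumptions of this example.

```python
import math

def free_interval(point, seg_start, seg_end, eps):
    """Parameters lam in [0, 1] with |point - (start + lam * (end - start))| <= eps.

    Returns (lo, hi), or None if the segment misses the eps-disk around point.
    """
    px, py = point
    (x0, y0), (x1, y1) = seg_start, seg_end
    dx, dy = x1 - x0, y1 - y0
    fx, fy = x0 - px, y0 - py
    # |f + lam * d|^2 <= eps^2 is a quadratic inequality in lam.
    qa = dx * dx + dy * dy
    qb = 2 * (fx * dx + fy * dy)
    qc = fx * fx + fy * fy - eps * eps
    if qa == 0:
        return (0.0, 1.0) if qc <= 0 else None
    disc = qb * qb - 4 * qa * qc
    if disc < 0:
        return None
    root = math.sqrt(disc)
    lo = max(0.0, (-qb - root) / (2 * qa))
    hi = min(1.0, (-qb + root) / (2 * qa))
    return (lo, hi) if lo <= hi else None

print(free_interval((0.0, 3.0), (-8.0, 0.0), (8.0, 0.0), 5.0))
print(free_interval((0.0, 3.0), (-8.0, 0.0), (8.0, 0.0), 2.0))
```

That the result is a single interval reflects the convexity of the cell noted above.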



Figure 2: A single cell in the free space

We define an order on the points as follows: for any two points $p_1 = (x_1, y_1)$, $p_2 = (x_2, y_2)$, $p_1 < p_2$ if
$x_1 < x_2$ and $y_1 < y_2$.

Let an $(x, y)$-monotone path be a path that is increasing in both $x$ and $y$ coordinates. Alt and Godau [^]
observed that the existence of an $(x, y)$-monotone path in $F_\epsilon$ from $(0, 0)$ to $(p, q)$ is a necessary and sufficient
condition for $d_F(P, Q) \le \epsilon$. A similar property holds for $\tilde{d}_F$; namely, the existence of any
non-self-intersecting path in $F_\epsilon$ from $(0,0)$ to $(p, q)$ implies that $\tilde{d}_F(P, Q) \le \epsilon$. Denote the property "$(p,q)$ is
reachable from $(0, 0)$" as property $\mathcal{P}$ (similarly define $\tilde{\mathcal{P}}$).

^3 In this section, we will consider the $\ell_2$ norm unless otherwise specified.

We wish to solve a decision problem for the Fréchet distance between $P$ and $Q$ minimised over translations,
i.e., given $\epsilon$, we wish to check whether $\min_t d_F(P, Q + t) \le \epsilon$.
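
For a single fixed translation, the monotone-path criterion can be approximated by sampling the free space on a grid and running a reachability dynamic program. This is the discrete Fréchet decision procedure applied to resampled chains, which we substitute here for illustration; it is not the exact algorithm of this paper. Because the samples include all vertices, a `True` answer certifies $d_F(P, Q + t) \le \epsilon$, while a `False` answer may be an artifact of the sampling. The names `resample`, `translate` and `sampled_frechet_at_most`, and the parameter `k`, are assumptions of this sketch.

```python
import math

def resample(chain, k):
    """k sample points per segment, plus the final vertex."""
    pts = []
    for (x0, y0), (x1, y1) in zip(chain, chain[1:]):
        for s in range(k):
            lam = s / k
            pts.append((x0 + lam * (x1 - x0), y0 + lam * (y1 - y0)))
    pts.append(chain[-1])
    return pts

def translate(chain, t):
    tx, ty = t
    return [(x + tx, y + ty) for x, y in chain]

def sampled_frechet_at_most(P, Q, eps, k=8):
    """Monotone reachability over the sampled free space (discrete Frechet)."""
    A, B = resample(P, k), resample(Q, k)
    reach = [[False] * len(B) for _ in A]
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            if math.dist(a, b) > eps:
                continue  # grid point lies outside the free space
            if i == 0 and j == 0:
                reach[i][j] = True
            else:
                reach[i][j] = ((i > 0 and reach[i - 1][j]) or
                               (j > 0 and reach[i][j - 1]) or
                               (i > 0 and j > 0 and reach[i - 1][j - 1]))
    return reach[-1][-1]

P = [(0.0, 0.0), (4.0, 0.0)]
Q = [(0.0, 1.0), (4.0, 1.0)]
print(sampled_frechet_at_most(P, Q, 1.0))
print(sampled_frechet_at_most(P, translate(Q, (0.0, -1.0)), 0.0))
```

Trying this test over a set of candidate translations gives a crude baseline for the minimisation problem; the remainder of this section is concerned with doing this exactly.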

The configuration space A critical event is one that can change the truth value of $\mathcal{P}$. Each such event
is one of the following two types: (1) The intersection points $a_{ij}, b_{ij}, c_{ij}, d_{ij}$ appear (or disappear). (2) For
two cells $C_{ij}$ and $C_{kj}$, $k > i$, $a_{ij}$ and $a_{kj}$ (or $b_{kj}$) change their relative vertical ordering. Analogously, for
two cells $C_{ij}$ and $C_{ik}$, $k > j$, the points $c_{ij}$ and $c_{ik}$ (or $d_{ik}$) change their relative horizontal ordering.

Type 2 events correspond to the creation or deletion of tunnels. For any point $r$ in the space $[0, p] \times [j, j + 1]$,
let $k$ be the rightmost interval such that $r$ projected onto the interval $[a_{kj}, b_{kj}]$ lies between the endpoints
of the interval. We define $rt(r) = k$.^4 For any point $r \in [i, i + 1] \times [0, q]$, let $k$ be the topmost interval such
that $r$ projected onto the interval $[c_{ik}, d_{ik}]$ lies between the endpoints of the interval. We define $ut(r) = k$.

As $Q$ translates, each of the $x_{ij}$, $x \in \{a, b, c, d\}$, can be represented as a function $x_{ij}(t) : \mathbb{R}^2 \to [0, 1]$.

Proposition 4.1 For a point $x_{ij}$, the function $x_{ij}(t)$ is a second degree polynomial in the coordinates of $t$.

From free space to a graph Our algorithm for computing $d_F(P, Q)$ is based on a reduction of the problem
to a directed graph reachability problem. Intuitively, we can think of a monotone path in the free space as
a path in a directed graph (actually a DAG). The advantage of this approach is that we can exploit known
methods for maintaining graph properties dynamically in an efficient manner. Thus, as we traverse the space
of translations, we need not recompute the free space at each critical event.

Let $V = \bigcup_{i,j} \{v^a_{ij}, v^b_{ij}, v^c_{ij}, v^d_{ij}\}$ and $T = \bigcup_{i,j,\; i<k\le p} \{t^a_{ijk}, t^b_{ijk}\} \cup \bigcup_{i,j,\; j<k\le q} \{t^c_{ijk}, t^d_{ijk}\}$, where $0 \le i \le p$
and $0 \le j \le q$. The vertices in $V \cup T$ are associated with points of the free space. More precisely, vertex $v^x_{ij}$
is associated with the point $x_{ij}$ (where $x$ is one of $\{a, b, c, d\}$). Vertex $t^x_{ijk}$ is associated with the projection
of point $x_{ij}$ onto the interval $[a_{kj}, b_{kj}]$ ($x \in \{a, b\}$), and vertex $t^y_{ijk}$ is associated with the projection of point
$y_{ij}$ onto the interval $[c_{ik}, d_{ik}]$ ($y \in \{c, d\}$). We define $f(v) = p$, where $p$ is the point associated with vertex
$v$.

Let $V^L_{ij} = \{v^a_{ij}, v^b_{ij}\} \cup \bigcup_{l < i \le rt(a_{lj})} t^a_{lji} \cup \bigcup_{l < i \le rt(b_{lj})} t^b_{lji}$ and $V^B_{ij} = \{v^c_{ij}, v^d_{ij}\} \cup \bigcup_{l < j \le ut(c_{il})} t^c_{ilj} \cup \bigcup_{l < j \le ut(d_{il})} t^d_{ilj}$.
$V^L_{ij}$ denotes the set of vertices associated with points on the line segment from $(i, j)$ to
$(i, j + 1)$. Similarly, $V^B_{ij}$ denotes the set of vertices associated with points on the line segment from $(i, j)$ to
$(i + 1, j)$. In addition, $V^L_{ij}$ and $V^B_{ij}$ contain vertices associated with points whose tunnels cross the cell $C_{ij}$.

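We now describe the construction of the edge set for each $(i, j)$. Firstly, set $E^1_{ij} = \{(v, v^b_{ij}) \mid v \in V^L_{ij}\}$
and set $E^2_{ij} = \{(v, v^d_{ij}) \mid v \in V^B_{ij}\}$. For each $v \in V^L_{ij}$, let $n(v) = \arg\min_{v' \in V^L_{i+1,j},\, f(v') \ge f(v)} f(v')$. Similarly,
for each $v \in V^B_{ij}$, let $n(v)$ denote the vertex in $V^B_{i,j+1}$ having the same property. Let $E^3_{ij} = \{(v, n(v)) \mid v \in
V^L_{ij} \cup V^B_{ij}\}$. Finally, set $E^4_{ij} = \{(v^b_{ij}, v^c_{i,j+1}), (v^d_{ij}, v^a_{i+1,j})\}$. Now, we set $E_{ij} = E^1_{ij} \cup E^2_{ij} \cup E^3_{ij} \cup E^4_{ij}$.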

Let $E = \bigcup_{ij} E_{ij}$. This yields the directed graph $G = (V \cup T, E)$. Note that $|V \cup T| = O(pq(p + q))$
and $|E| = O(pq(p + q))$. Also, it is easy to see that for any edge $(u, v) \in E$, the straight line from $f(u)$
to $f(v)$ is an $(x, y)$-monotone path. We first show that reachability in the graph $G$ is equivalent to path
construction in $F_\varepsilon$. The proof of this theorem is straightforward and is deferred to Appendix C.

Theorem 4.2 An $(x, y)$-monotone path from $(0, 0)$ to $(p, q)$ exists in $F_\varepsilon$ iff $v_{pq}$ is reachable from $v_{00}$ and
$f(v_{00}) = (0, 0)$, $f(v_{pq}) = (p, q)$.

For every edge $e \in E$, let $\gamma(e) \subset \mathbb{R}^2$ be the set of translations $t$ such that in the graph $G$ constructed
from the free space $F_\varepsilon(P, Q + t)$, the edge $e$ is present. Let $\Gamma$ be the arrangement of all the $\gamma(e)$. We first
establish a bound on the complexity of $\Gamma$.

Footnote: The term rt denotes a right tunnel; ut denotes an upper tunnel.



The following three propositions (which we state without proof) follow from Proposition 4.1. Roughly
speaking, with each edge $e$ we can associate a boolean combination of predicates $P_1, P_2, \ldots, P_k$, where
each predicate compares some constant degree polynomial to zero (i.e., the regions are semi-algebraic sets).

• For any region $\gamma(e)$, the boundaries consist of segments of curves described by constant degree polynomials.

• For an edge $e \in E_{ij} - T \times T$, the region $\gamma(e)$ is a constant number of simple regions of constant description
complexity.

• For an edge of the form $(t^x_{ijk}, t^x_{ij,k+1})$, $x \in \{a, b, c, d\}$, the region $\gamma(e)$ consists of a set of simple regions
of total description complexity $k$.

Lemma 4.3 $|\Gamma| = O(p^2 q^2 (p + q)^4)$.

Proof Sketch: There are $O(pq(p + q))$ edges. For each edge $e$, the complexity of the associated region can
be at most $O(p + q)$. Since any pair of constant degree polynomials intersect in a constant number of points,
the overall complexity of $\Gamma$ is given by $O((pq(p + q) \times (p + q))^2)$. ■

Lemma 4.4 Let $\gamma_k = \gamma((t^x_{ijk}, t^x_{ij,k+1}))$, where $x \in \{a, b, c, d\}$. Then for all $l$ such that $i < l < k$, $\gamma_k \subseteq \gamma_l$.

Proof: Whenever the edge $(t^x_{ijk}, t^x_{ij,k+1})$ is present, all edges of the form $(t^x_{ijl}, t^x_{ij,l+1})$, $i < l < k$, must also
be present. ■



Theorem 4.2 indicates that the graph property that we need to maintain is the reachability of $v_{pq}$ from
$v_{00}$. The algorithm is now as follows: Fix a traversal of the arrangement of regions. Check reachability at
the starting cell. Each time an edge of the arrangement is crossed in the traversal, it corresponds to the deletion (and insertion)
of edges in the graph, which we use to update the graph and check for reachability. Stop whenever the above
property holds, returning YES, else return NO.
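The traversal loop can be sketched as follows. This is an illustrative simplification, not the procedure analysed below: it recomputes reachability from scratch with a breadth-first search after every crossing instead of using a dynamic data structure, and it assumes the traversal has already been turned into a list of `(removed_edges, added_edges)` pairs. The names `is_reachable`, `decide`, and `crossings` are hypothetical.

```python
from collections import defaultdict, deque

def is_reachable(edges, source, target):
    """Breadth-first search in the directed graph given by a set of (u, v) edges."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    seen, queue = {source}, deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return True
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

def decide(initial_edges, crossings, source, target):
    """Return True (YES) if target is reachable from source in the starting
    cell or after any crossing of the traversal; otherwise return False (NO).
    crossings is a sequence of (removed_edges, added_edges) pairs, one per
    arrangement edge crossed."""
    edges = set(initial_edges)
    if is_reachable(edges, source, target):
        return True
    for removed, added in crossings:
        edges.difference_update(removed)
        edges.update(added)
        if is_reachable(edges, source, target):
            return True
    return False
```

Replacing the per-crossing search with a dynamic reachability structure is what the complexity analysis that follows is concerned with.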

Theorem 4.5 The above algorithm terminates with a YES iff there exists a translation $t$ such that $d_F(P, Q + t) \le \varepsilon$.

Proof: Consider a type 1 critical event, where the interval $[a_{ij}, b_{ij}]$ is created. This interval corresponds to the
edge $(v^a_{ij}, v^b_{ij})$. Hence, this event corresponds to entering the region associated with the above edge. Similar
arguments hold for other type 1 critical events.

Suppose we have a type 2 critical event, where the point $a_{kj}$ rises above $a_{ij}$ (in their relative vertical
ordering). Note that this event does not change the reachability of $(p, q)$ in the free space unless $rt(a_{ij}) > k$.
If this is the case, then the event results in setting $rt(a_{ij}) = k$, implying that all edges of the form
$(t^a_{ijl}, t^a_{ij,l+1})$, $l \ge k$, are deleted, which corresponds to leaving the regions corresponding to this set of edges.

Conversely, it can be seen that any transition from one cell of the arrangement to another corresponds to 
a critical event. We defer the details to a full version of the paper. ■ 

It now remains to analyse the complexity of the above algorithm. A transition between cells yields $O(1)$
updates, except in the case described in the proof of Theorem 4.5 above, where a transition occurs across the boundary
of region $\gamma((t^a_{ij,l-1}, t^a_{ijl}))$ into the region $\gamma((t^a_{ij,k-1}, t^a_{ijk}))$, causing $\Theta(l - k)$ updates. However, note that in
this event, it must be the case that all the regions $\gamma((t^a_{ijm}, t^a_{ij,m+1}))$, $k \le m \le l - 1$, intersect at this transition
point (from Lemma 4.4), and thus the cost of this transition can be distributed among these cells. Hence, the
total number of updates is given by Lemma 4.3.

To determine reachability, we must now traverse the arrangement. For ease of notation, we will assume
that $p = \Theta(q)$ and set $n = p + q$. The arrangement consists of $O(n^3)$ regions, each described by $O(n)$
curves of constant description complexity. Let us fix $r$ (we will specify the value of $r$ later). It can be shown
(using the theory of cuttings [17], [19]) that we can compute a subset $\mathcal{R}$ of the regions of size $O(r \log r)$ with
the property that if we compute the vertical decomposition of each super-cell in the arrangement of $\mathcal{R}$, each
of the resulting primitive super-cells (of constant complexity) is intersected by $O(n^3/r)$ regions.

Footnote (on the deletion of tunnel edges in the proof of Theorem 4.5): Note that since the regions corresponding to this set of edges are nested (by Lemma 4.4), such a transition is indeed possible.
In fact, the existence of such a critical point implies that all of these regions intersect in at least one point that is also contained in
$\gamma((t^a_{ij,k-1}, t^a_{ijk}))$. The critical event can be interpreted as the result of the translation across this point.

Lemma 4.6 Given a graph $G = (V, E)$, $|V| = N$, $|E| = M$, designated nodes $s, t \in V$, and a set of
$k$ edges $E' \subseteq E$, $s$-$t$ reachability in $G$ can be maintained over edge insertions and deletions from $E'$ in
total time $O(\min(N^\omega, Mk) + k^2 U)$, where $U$ is the number of such updates ($\omega$ is the exponent for matrix
multiplication).

Proof: Let $V'$ be the set of endpoints of edges in $E'$. We compute the graph $G' = (V'' = V' \cup \{s, t\}, E'')$,
where $(u, v) \in E''$ if there is a directed path from $u$ to $v$ in $G$. Note that $|V''| \le 2k + 2$. The computation of
this graph can be done by performing a full transitive closure on $G$ that takes time $O(N^\omega)$. Alternatively, we
can perform $O(k)$ depth-first searches (one from each vertex in $V''$) to construct $G'$.

Now, to process updates, we update the graph using a standard dynamic update procedure that takes
$O(k^2 \log k)$ time (amortized) per update [24], yielding the result. ■
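The compression step of this proof can be illustrated with a small sketch. It is written under several assumptions that are ours, not the paper's: the edges of $E'$ start out absent, the static part of the graph is $G$ without $E'$, and each query simply searches the compressed graph (costing time polynomial in $k$) rather than using the dynamic update procedure cited in the proof. The class and method names are hypothetical.

```python
from collections import defaultdict, deque

def reachable_from(adj, src):
    """Set of vertices reachable from src in the adjacency mapping adj."""
    seen, queue = {src}, deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

class CompressedReachability:
    """Maintain s-t reachability while only edges from a small set E' change."""

    def __init__(self, edges, special_edges, s, t):
        special = set(special_edges)
        static_adj = defaultdict(list)
        for u, v in edges:
            if (u, v) not in special:
                static_adj[u].append(v)
        self.s, self.t = s, t
        # Terminals: s, t, and the endpoints of the special edges.
        terminals = {s, t}
        for u, v in special:
            terminals.update((u, v))
        # One search per terminal, using the static edges only.
        self.closure = {u: reachable_from(static_adj, u) & terminals
                        for u in terminals}
        self.active = set()  # special edges that are currently present

    def insert(self, edge):
        self.active.add(edge)

    def delete(self, edge):
        self.active.discard(edge)

    def query(self):
        """Search the compressed graph on the terminals."""
        adj = defaultdict(set)
        for u, targets in self.closure.items():
            adj[u] |= targets
        for u, v in self.active:
            adj[u].add(v)
        return self.t in reachable_from(adj, self.s)
```

The point of the compression is that each update or query touches only the $O(k)$ terminals, independent of the size of the original graph.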

The algorithm now proceeds as follows: Each primitive super-cell has a set of edges associated with it 
(one for each region that intersects it). We use the above lemma to perform an efficient dynamic reachability 
test for each cell of the original arrangement in this primitive super-cell. When we move to the next primitive 
super-cell, we recompute the induced graph and repeat the process. 

We now compute the value of $r$. The total number of cells in the arrangement is $O(n^8)$ by Lemma 4.3.
There are $O(r^2 n^2 \log^2 r)$ primitive super-cells, each intersected by $O(n^3/r)$ regions. Consider a single
primitive super-cell $i$. We apply Lemma 4.6 with $N = M = O(n^3)$, $k = O(n^3/r)$, and $U = U_i$, where $U_i$
is the number of cells in $i$. The current value of $\omega$ is approximately 2.376 [18], and thus $\min(N^\omega, Mk) =
Mk = n^6/r$ for all $r = \Omega(1)$. The cost of processing $i$ is therefore $n^6/r + n^6 U_i/r^2$. Summing over all
primitive super-cells, and replacing $\sum_i U_i$ by $O(n^8)$, we obtain the overall running time of the algorithm to
be $O(n^8 r \log^2 r + n^{14}/r^2)$. Balancing, we obtain an overall running time of $O(n^{10}\,\mathrm{polylog}\, n)$.
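As a check on the balancing step, the arithmetic can be written out as follows. This is only a sketch under the bounds stated above ($O(r^2 n^2 \log^2 r)$ primitive super-cells, $N = M = O(n^3)$, $k = O(n^3/r)$, and $\sum_i U_i = O(n^8)$); the choice $r = n^2$ is our reading of "balancing", and logarithmic factors are absorbed into the polylog term.

```latex
\begin{align*}
T(r) &= O\!\left(r^2 n^2 \log^2 r\right)\cdot \frac{n^6}{r}
        \;+\; \frac{n^6}{r^2}\sum_i U_i
      \;=\; O\!\left(n^8\, r \log^2 r + \frac{n^{14}}{r^2}\right),\\[4pt]
n^8\, r &= \frac{n^{14}}{r^2}
      \;\Longrightarrow\; r^3 = n^6
      \;\Longrightarrow\; r = n^2,\\[4pt]
T(n^2) &= O\!\left(n^{10}\log^2 n + n^{10}\right)
      \;=\; O\!\left(n^{10}\,\mathrm{polylog}\, n\right).
\end{align*}
```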

Theorem 4.7 Given two polygonal chains $P, Q$, $|P| = p$, $|Q| = q$, and $\varepsilon > 0$, we can check if $\min_t d_F(P, Q + t) \le \varepsilon$
in time $O(n^{10}\,\mathrm{polylog}\, n)$, where $n = p + q$.

The weak Frechet distance As described earlier, the weak Frechet distance (denoted by $\tilde{d}_F$) relaxes the
constraint that the parametrizations employed must be monotone. Note that for any two curves $P, Q$, the
following inequality is true: $d_H(P, Q) \le \tilde{d}_F(P, Q) \le d_F(P, Q)$. Also, by the result of Godau [^], all three
measures collapse to one if both curves are convex. The above inequality is significant because it suggests
that the weak Frechet distance may serve as a relaxed curve matching measure with possibly more tractable
algorithms.

As it turns out, this is indeed the case. Our techniques from the previous algorithm apply here as well,
with two key differences. Firstly, since the paths need not be monotone, we no longer need the concept
of a tunnel, thus reducing the number of critical events that need to be examined to $O(pq)$. Secondly, the
underlying graph is now undirected, and there are efficient procedures for maintaining connectivity in an
undirected graph [22]. We defer details to a full version of the paper, and summarize the result as:



Theorem 4.8 Given two polygonal chains $P, Q$, $|P| = p$, $|Q| = q$, and $\varepsilon > 0$, we can check if $\min_t \tilde{d}_F(P, Q + t) \le \varepsilon$
in time $O(n^4\,\mathrm{polylog}\, n)$, where $n = O(p + q)$.

An approximation scheme An $(\varepsilon, \beta)$-approximation (defined by Heffernan and Schirra [^]) for $d_F(P, Q)$
under translations can be obtained from the following observation:

Lemma 4.9 Given polygonal chains $P, Q$, let $t$ be the translation that maps the first point of $Q$ to the first
point of $P$. Then $d_F(P, Q + t) \le 2d^*$, where $d^* = \min_{\text{translations } t} d_F(P, Q + t)$.

Proof: Let $t^*$ be the translation such that $d_F(P, Q + t^*) = d^*$. Clearly, the first point in $Q + t^*$ is at most $d^*$
away from the first point of $P$, so the translation $t' = t - t^*$ has length at most $d^*$. Applying $t'$ to $Q + t^*$, no point
is moved more than $d^*$ units, and hence every point ends up at most $2d^*$ from its associated point in $P$.
Hence, $d_F(P, Q + t^* + t') = d_F(P, Q + t) \le 2d^*$. ■
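The translation used in Lemma 4.9 is simple to compute. The sketch below illustrates it, with one loudly stated substitution: to keep the example short it evaluates the result with the discrete Frechet distance over the chain vertices (a standard dynamic program), which is a different measure from the continuous Frechet distance $d_F$ treated in this paper and is used here only as a stand-in. The function names are hypothetical.

```python
import math

def align_first_points(P, Q):
    """Translation t mapping the first point of Q to the first point of P,
    together with the translated copy Q + t."""
    tx, ty = P[0][0] - Q[0][0], P[0][1] - Q[0][1]
    return (tx, ty), [(x + tx, y + ty) for x, y in Q]

def discrete_frechet(P, Q):
    """Discrete Frechet distance between vertex sequences P and Q
    (stand-in for d_F; not the continuous measure)."""
    n, m = len(P), len(Q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                prev = 0.0
            elif i == 0:
                prev = ca[0][j - 1]
            elif j == 0:
                prev = ca[i - 1][0]
            else:
                prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
            ca[i][j] = max(prev, math.dist(P[i], Q[j]))
    return ca[-1][-1]
```

For example, aligning `Q = [(5, 5), (6, 5), (7, 5)]` with `P = [(0, 0), (1, 0), (2, 0)]` yields the translation `(-5, -5)`, after which the two chains coincide.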

Applying the standard discretization trick in a ball of radius $d^*$ around the first point of $P$, we obtain an
$(\varepsilon, \beta)$-approximation for any $\beta > 0$. Note that this scheme is very efficient, running in time $O(n^2\,\mathrm{poly}(\log n,$
A Proof of Lemma 2.7



Definition A.1 For a geometric object $R$ let $X(R)$, the x-span of $R$, denote the interval of the x-axis
between the leftmost and the rightmost point of $R'$, where $R'$ is the orthogonal projection of $R$ on the x-axis.

Claim A.2 Let $P = \{(x_1, y_1), \ldots, (x_m, y_m)\}$ be a point set. We can construct in time $O(m \log m)$ a data
structure for $P$ such that given a query segment $s$, the point $(x_k, y_k)$ that maximizes the y-value of the set
$\{s(x_i) + y_i \mid x_i \in X(s),\ 1 \le i \le m\}$ can be found in time $O(\log m)$.

Proof: If $X(P) \subseteq X(s)$, then $(x_k, y_k)$ is clearly a vertex of the convex hull of $P$, and once the convex
hull is computed, we can find $(x_k, y_k)$ in time $O(\log m)$. To answer the query in the case that $X(P)$ is
not contained in $X(s)$, we construct a sorted balanced binary tree $\Psi = \Psi(P)$ on the set $\{x_1, \ldots, x_m\}$. For
each node $\mu \in \Psi$ let $P_\mu$ denote the points in the subtree of $\mu$, and let $X_\mu$ denote the x-span of $P_\mu$. We
construct $C_\mu$, the convex hull of $P_\mu$, for each node $\mu$ of $\Psi$. Once a query segment $s$ is given, we find a set
$U$ of $O(\log |P|)$ nodes of $\Psi$ with the property that for each node $\mu \in U$, $X_\mu$ is contained in $X(s)$, and in
addition, each $(x_i, y_i) \in P$ for which $x_i \in X(s)$ appears in exactly one of the sets $P_\mu$, for $\mu \in U$. We
perform the query described for the first case on $C_\mu$ for each $\mu \in U$. ■
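A minimal sketch of such a structure is given below. It makes several assumptions of our own: the query segment is passed as a line `slope * x + intercept` together with its x-span `[x_lo, x_hi]`, only upper hulls are stored (sufficient for maximization), input points are distinct, and each canonical node is searched by binary search, so a query costs $O(\log^2 m)$ in this sketch rather than the $O(\log m)$ stated in the claim. The class and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

def upper_hull(points):
    """Upper convex hull of points sorted by x (monotone chain)."""
    hull = []
    for p in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Pop hull[-1] if it lies on or below the line from hull[-2] to p.
            if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def max_on_hull(hull, slope):
    """Maximum of slope * x + y over an upper hull; the values are unimodal."""
    g = lambda i: slope * hull[i][0] + hull[i][1]
    lo, hi = 0, len(hull) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if g(mid) < g(mid + 1):
            lo = mid + 1
        else:
            hi = mid
    return g(lo)

class HullTree:
    """Balanced tree over points sorted by x, with an upper hull per node."""

    def __init__(self, points):
        self.pts = sorted(points)
        self.xs = [p[0] for p in self.pts]
        self.hulls = {}
        if self.pts:
            self._build(0, len(self.pts))

    def _build(self, lo, hi):
        self.hulls[(lo, hi)] = upper_hull(self.pts[lo:hi])
        if hi - lo > 1:
            mid = (lo + hi) // 2
            self._build(lo, mid)
            self._build(mid, hi)

    def query(self, slope, intercept, x_lo, x_hi):
        """Max of slope * x_i + intercept + y_i over points with
        x_lo <= x_i <= x_hi, or None if there is no such point."""
        i, j = bisect_left(self.xs, x_lo), bisect_right(self.xs, x_hi)
        best, stack = None, [(0, len(self.pts))]
        while stack:
            lo, hi = stack.pop()
            if lo >= hi or hi <= i or j <= lo:
                continue  # empty node or disjoint from the query range
            if i <= lo and hi <= j:  # canonical node: query its hull
                val = max_on_hull(self.hulls[(lo, hi)], slope) + intercept
                best = val if best is None else max(best, val)
            else:
                mid = (lo + hi) // 2
                stack += [(lo, mid), (mid, hi)]
        return best
```

The nodes whose hulls are queried play the role of the set $U$ in the proof: their point sets are disjoint and together cover exactly the points whose x-coordinate lies in the query span.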



Based on Claim A.2, we describe the data structure as follows. Let $m = |S|$. First observe that the
maximum must be obtained at an endpoint of a segment of $S$. We partition $S$ into $S_1$ and $S_2$. The set
$S_1$ contains at least $m - \sqrt{m}$ of the segments of $S$. It is updated after $\sqrt{m}$ insertion or deletion operations
into/from $S$. Once it is updated, we explicitly compute the function $\mathrm{sum}_{S_1}(\cdot)$, and construct the data structure
$\Psi = \Psi_{S_1}$ of Claim A.2 for the vertices of the graph of $\mathrm{sum}_{S_1}(\cdot)$. As is easily observed, the complexity of the
graph of $\mathrm{sum}_{S_1}(\cdot)$ is $O(m)$, since a vertex of this function occurs only at an endpoint of a segment of $S_1$; thus
the time needed to construct $\Psi = \Psi_{S_1}$ is $O(m \log m)$. The set $S_2 = S \setminus S_1$ has cardinality $\le \sqrt{m}$. Each time a segment is
inserted (resp. deleted) into/from $S$, it is inserted (resp. deleted) into/from $S_2$. Once the size of $S_2$ exceeds
$\sqrt{m}$, we set $S_1$ to be $S$, construct $\Psi$, and empty $S_2$.

In order to maintain the maximum $\max(\mathrm{sum}_S(\cdot))$, we do the following. Once a segment is inserted
or deleted into $S_2$, we explicitly compute (the graph of) $\mathrm{sum}_{S_2}(\cdot)$, which is piecewise linear of complexity
$O(\sqrt{m})$. With each segment $e$ of this graph (not to be confused with the segments of $S$) we perform a query
in $\Psi_{S_1}$. The maximum obtained is $\max(\mathrm{sum}_S(\cdot))$.

Next we describe the modifications of the data structure needed in the case where (some of) the segments
of $S$ move vertically at a constant speed with the time parameter $\tau$. Let $X'$ denote the set of x-coordinates
of the endpoints of the segments of $S$. They are not time dependent. Let $y(x, \tau)$ denote the
y-value of the sum function at the coordinate $x$ at time $\tau$. Clearly, as long as no insertions or deletions
take place in $S$, $y(x, \tau)$ moves (vertically) at a constant velocity. It is a well known fact that the convex
hull of such a set of points can go through $O(m)$ combinatorial changes, which we can compute in time
$O(m \log m)$. This suggests the following modification to the data structure $\Psi$. As before, each
node $\mu$ is associated with the convex hull $C_\mu = C_\mu(\tau)$, but now these convex hulls might change
in time. However, as argued, the total number of changes they go through is only $O(m \log^2 m)$. The query
process remains the same.



B Proof of Theorem B:^ 



Assume for the construction that $\varepsilon = 1/2$. The first component in the construction (see Figure 3) is the set
$B'_1$ consisting of $2n$ points, which are

$\{(i, 1/2 - i/n) \text{ and } (i, -1/2 - i/n - 1/4n^2), \text{ for } i = 1 \ldots n\}$.

Thus the $i$-th pair $(i, 1/2 - i/n)^+$ and $(i, -1/2 - i/n - \delta)^+$ (i.e., the Minkowski sum of these points and
the $\ell_\infty$ ball), where $\delta = 1/4n^2$, form two close vertically aligned squares, where the gap between them is of unit width, and
of height $1/4n^2$. The $i$-th pair is located at distance $i/n$ below the x-axis. We add the segment $B''_1$, which
is the long horizontal segment between the points $(-n, -1/4)$ and $(0, -1/4)$, and the segment $B'''_1$ between
$(n, -1/4)$ and $(2n, -1/4)$. Let $B_1 = B'_1 \cup B''_1 \cup B'''_1$.

[Figure 3: The lower bound construction for n = 3. The set B is not shown explicitly; only $B^+$ is shown. Labels in the figure: $A_1$ consists of a set of n segments of length 2n, and the vertical distance between consecutive segments is 1/n; $A_2$ is a set of n points whose distance is 1/n from each other; $B'_1$ consists of n pairs of points; $B''_1$ and $B'''_1$ are each a segment of length n; $B_2$ consists of n points (not shown), which are the centers of n unit squares.]

The set $A_1$ consists of $n$ horizontal segments of length $2n$, each separated by a gap of $1/n^2$ from the
next one. The left endpoint of all of them is on the y-axis, and the middle one is on the x-axis. By shifting
them vertically, each segment in turn is not completely covered at some time, when it passes through the
gap between one of the pairs of $B_1$. In all other cases, all the segments are completely covered. The region
$S_1$ in $\mathbb{R}^2$ corresponding to all translations $t$ for which $h(t + A_1, B_1) \le 1$ consists of $\Omega(n^2)$ horizontal strips,
each of length $n$.

The set $B_2$ consists of the $n$ points $(-(1 + 1/n^2)i, -5)$ (for $i = 1 \ldots n$). Thus $B_2^+$ creates $n$ unit
squares along the line $y = -5$, with a gap of $1/n^2$ between them. The set $A_2$ consists of $n$ points along
the horizontal line, $(-i/2n, -5)$ (for $i = 1 \ldots n$). Observe that $A_2$ fits completely into each of the squares
of $B_2^+$. However, by sliding $A_2$ horizontally, along $y = -5$ or anywhere at distance $\le 1$ from it, each of
the points of $A_2$ "falls" at some stage into each of the gaps between the squares of $B_2^+$. The region
$S_2 = \{t \mid h(t + A_2, B_2) \le 1\}$ consists of $\Omega(n^2)$ vertical strips in $\mathbb{R}^2$, each of height 2. Letting $A = A_1 \cup A_2$
and $B = B_1 \cup B_2$, the region $S = \{t \mid h(t + A, B) \le 1\}$ is merely the intersection of $S_1$ and $S_2$, which is
clearly of complexity $\Omega(n^4)$, thus proving our claim.



C Proof of Theorem 4.2



Suppose $v_{pq}$ is reachable from $v_{00}$ and $f(v_{00}) = (0, 0)$, $f(v_{pq}) = (p, q)$. Let the path in $G$ be
$v_1 = v_{00}, v_2, \ldots, v_k = v_{pq}$. Replace each vertex $v_i$ by its associated point $f(v_i)$. As observed above, if we now
connect the points $f(v_1), f(v_2), \ldots, f(v_k)$ by straight lines, we obtain an $(x, y)$-monotone path.

Conversely, suppose there exists an $(x, y)$-monotone path $w$ from $(0, 0)$ to $(p, q)$ in $F_\varepsilon$. Then $(0, 0) \in
C_{00}$ and $(p, q) \in C_{p-1,q-1}$, and thus $f(v_{00}) = (0, 0)$ and $f(v_{pq}) = (p, q)$. Without loss of generality, we can
assume that $w$ consists of a sequence of line segments, where the endpoints of each segment are one of the
$x_{ij}$'s ($x \in \{a, b, c, d\}$).

We will show by induction on the number of segments that $v_{pq}$ is reachable from $v_{00}$. Assume that the
claim holds for the first $k$ segments on the path. Consider the $(k + 1)$-th segment. Let the endpoints be
$w_1, w_2$. By the induction hypothesis, $w_1$ is reachable from $v_{00}$.

Case 1: Let both $w_1, w_2$ be of the form $x_{ij}, y_{kj}$ respectively, where $x, y \in \{a, b\}$. If $rt(f(w_1)) \ge k$,
then the vertex $t^x_{ijl}$ exists for all $l \le k$, and thus there exists a path $w_1, t^x_{ij,i+1}, \ldots, t^x_{ijk}$. Since $f(t^x_{ijk})$ is
on the same interval as $f(w_2)$ and must be below it, there exists an edge from $t^x_{ijk}$ to $w_2$ in $E$. If, on the
other hand, $rt(f(w_1)) < k$, there must exist one vertex $w' = x_{lj}$, $i < l < k$, such that $f(w') > f(w_1)$ and
$rt(f(w_1)) < l$. We construct a path from $w_1$ to $w'$ and repeat.

Case 2: Let both $w_1$ and $w_2$ be of the form $x_{ij}, y_{ik}$ respectively, where $x, y \in \{c, d\}$. An argument
similar to Case 1 applies here.

Case 3: Let $w_1 = a_{ij}$ and $w_2 = d_{kl}$. Without loss of generality we can assume that $k = i$ and $l = j + 1$.
There exists an edge from $v^a_{ij}$ to $v^b_{ij}$, which is a predecessor of $v^c_{i,j+1}$ (using $E^4$), and there exists an edge
from $v^c_{i,j+1}$ to $v^d_{i,j+1}$, thus yielding the desired path. Other cases can be handled symmetrically.

Thus, by induction the theorem holds. 



